Improve SSR raymarching performance #99693

Open
wants to merge 1 commit into
base: master

@Flarkk Flarkk commented Nov 25, 2024

This PR is a major rewrite of the Screen Space Reflection raymarching code, targeting performance optimization:

  • Implements a DDA algorithm that marches the ray simultaneously in NDC and homogeneous view space, as described in "Efficient GPU Screen-Space Ray Tracing" (Morgan McGuire et al.).

  • Produces a linear depth buffer during the scale pre-pass (this was already the case for the single-eye setup, but not for VR). Combined with homogeneous view space marching, this removes the need for any reprojection in the ray marching loop.

  • Removes the normal-roughness buffer fetches that were used for backface culling during marching. Culling is now done by comparing the depth of the current and previous samples and rejecting hits when the ray exits the volume.

  • Solves two issues:

    • Rays didn't pass behind objects, which resulted in long hollow trails (see captures below). These are now gone and, in the worst case, replaced by smaller holes corresponding to the object's footprint on the surface behind it
    • The SSR code no longer depends on any method that reconstructs camera attributes (like Projection::get_z_far() or Projection::is_orthogonal()). These can break under certain circumstances: typically, when the zfar / znear ratio is very large, the projection matrix becomes infinite and zfar can no longer be extracted from it
  • Hopefully improves code readability and establishes a good foundation for further improvements. A few ideas I am leaving for future PRs:

    • Ray striding and jittering, as described in the paper above. This allows marching farther with the same number of samples without degrading image quality too much
    • Adding a binary search pass to refine the hit points, as suggested here
    • Hierarchical-Z approaches involving pre-computed z-buffer mip-maps to speed up marching even further
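
To make the first two items more concrete, here is a minimal Python sketch of the interpolation idea from the paper. It is not the shader code in this PR. It assumes an OpenGL-style projection matrix and a camera looking down -Z, and `perspective`, `to_clip` and `march` are hypothetical names used only for illustration. Because 1/w and z/w both vary linearly in screen space, the view-space depth of each sample is recovered with a single divide, with no reprojection inside the loop. A real DDA would also step one pixel at a time along the major screen axis; this sketch simply samples at even screen-space intervals.

```python
import math


def perspective(fov_y_deg, aspect, z_near, z_far):
    """OpenGL-style projection matrix (row-major), camera looking down -Z."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (z_far + z_near) / (z_near - z_far),
         2.0 * z_far * z_near / (z_near - z_far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def to_clip(proj, p):
    """Multiply a view-space point (x, y, z) by the projection matrix."""
    v = (p[0], p[1], p[2], 1.0)
    return [sum(row[i] * v[i] for i in range(4)) for row in proj]


def march(proj, p0, p1, steps):
    """Yield (ndc_x, ndc_y, view_z) for samples evenly spaced in screen space."""
    h0, h1 = to_clip(proj, p0), to_clip(proj, p1)
    k0, k1 = 1.0 / h0[3], 1.0 / h1[3]      # 1/w is linear in screen space
    s0 = (h0[0] * k0, h0[1] * k0)          # NDC position of the ray start
    s1 = (h1[0] * k1, h1[1] * k1)          # NDC position of the ray end
    q0, q1 = p0[2] * k0, p1[2] * k1        # view z / w is linear as well
    for i in range(steps + 1):
        t = i / steps
        k = k0 + (k1 - k0) * t
        q = q0 + (q1 - q0) * t
        yield (s0[0] + (s1[0] - s0[0]) * t,
               s0[1] + (s1[1] - s0[1]) * t,
               q / k)                      # one divide, no inverse projection
```

Each sample's `view_z` can then be compared directly against a linear depth buffer fetched at the same screen position.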

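The zfar extraction problem mentioned in the list above can be reproduced outside the engine. The sketch below is an illustration only and is not how `Projection::get_z_far()` is implemented: it assumes an OpenGL-style perspective matrix whose depth terms are stored as 32-bit floats, and `f32`, `depth_terms` and `extract_z_far` are hypothetical helpers. When zfar / znear is very large, the first depth term rounds to exactly -1 and zfar can no longer be recovered from the matrix.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def depth_terms(z_near, z_far):
    """Depth terms of an OpenGL-style perspective matrix, stored as float32."""
    a = f32((z_far + z_near) / (z_near - z_far))
    b = f32(2.0 * z_far * z_near / (z_near - z_far))
    return a, b


def extract_z_far(a, b):
    """Recover z_far from the stored terms, or None if it is not recoverable."""
    denom = f32(a + 1.0)  # equals 2 * z_near / (z_near - z_far) in exact math
    return None if denom == 0.0 else b / denom
```

With a moderate ratio such as `depth_terms(0.05, 4000.0)` the extraction still works, although only approximately. With `depth_terms(0.01, 1e7)` the denominator collapses to zero and `extract_z_far` returns `None`.
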
Visual differences

Cube roughness is 0.2, floor roughness is 0.0.
Depth threshold is 0.1.
Raymarching 512 steps.

This is the single-eye case. Any help to test it in VR is welcome.
Also, any test with more complex scenes would be appreciated.

Before: [screenshot, 2024-11-25 15-14-25]
With this PR: [image]

Performance improvements

These should be material for both single-eye and VR setups. I couldn't measure them statistically (I haven't yet sorted out the render graph messing up debug markers, despite active support from @clayjohn, @Ansraer and @DarioSamo), but the GPU traces below suggest a ~20% compute time reduction.

Given this, please take the interpretation below with a grain of salt (and please ignore the markers):

  1. might be the scale pass, left pretty much unchanged in single-eye setup
  2. is likely the core ray marching logic, optimized by ~50% after this PR
  3. looks like the filter pass, although it's pretty much left untouched by this PR and I don't get why it would be longer

Any help on making this analysis stronger is welcome.

Top chart : before
Bottom chart : with this PR

image

@hsvfan-jan

The white highlight beneath the cube seems to be accentuated by the new algorithm. Is it possible to turn that down to a similar level as before, where it wasn't as obvious? Perhaps the reflection is just mistakenly offset by a few pixels and that's what's causing the highlight.

@Flarkk
Author

Flarkk commented Nov 26, 2024

Just fixed the white contact line.
The backface culling logic was rejecting all hits on the first marching iteration, hence the missed reflections close to contact areas.
This is no longer the case.
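
For readers following along, here is one possible way to express this kind of check, written as a small Python sketch. It is an assumption made for illustration and may differ from the actual shader logic: `first_hit` and its parameters are hypothetical, and all depths are linear view-space distances (larger means farther). A hit is accepted only when the ray segment between the previous and current samples touches the surface slab and the previous sample was in front of the surface. Seeding the previous depth with the ray origin lets the very first step register a hit.

```python
def first_hit(origin_depth, ray_depths, scene_depths, tolerance):
    """Return the index of the first accepted hit along the ray, or None."""
    prev_ray = origin_depth  # seed: the first step compares against the origin
    for i, (ray, scene) in enumerate(zip(ray_depths, scene_depths)):
        lo, hi = min(prev_ray, ray), max(prev_ray, ray)
        # The segment [lo, hi] touches the slab [scene, scene + tolerance].
        overlaps = lo <= scene + tolerance and hi >= scene
        # The previous sample was in front of the surface: the ray is entering.
        entering = prev_ray <= scene
        if overlaps and entering:
            return i
        prev_ray = ray
    return None
```

In this formulation a ray that travels behind an object produces no hit while it is occluded, and a ray that emerges from behind a surface is rejected because its previous sample was already past that surface.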

@RPicster
Contributor

RPicster commented Nov 26, 2024

I gave it a test: it doesn't improve perceived quality for me, and the PR introduces new artifacts. The artifacts are mostly visible on the dark side of the cube.
image
image

@Flarkk
Author

Flarkk commented Nov 26, 2024

@RPicster thanks for testing. Can you share the project files?
The artifacts might be related to the Depth tolerance parameter in Environment being used slightly differently in the new implementation. I will see how I can make it work exactly as before, so that users don't have to adjust it.

Also, this PR is mostly focused on performance, so don't expect quality improvements in most cases.

@Flarkk Flarkk changed the title from "Improve SSR raymarching performance and quality" to "Improve SSR raymarching performance" on Nov 26, 2024
@RPicster
Contributor

ssr-test.zip
I adjusted it a bit; I hope it helps.

@mrjustaguy
Contributor

I think the issue you're seeing actually comes from a quality improvement in this PR: reflections are more accurate, whereas the current SSR tends to elongate things, filling the gaps you're seeing.

The effect can be seen in the before and after screenshots of the PR.

5 participants